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ABSTRACT 

The Fourier transform of a dataset apodised with a window function is known as the 
Gabor transform. In this paper we extend the Gabor transform formalism to the sphere 
with the intention of applying it to CMB data analysis. The Gabor coefficients on the 
sphere known as the pseudo power spectrum is studied for windows of different size. 
By assuming that the pseudo power spectrum coefficients are Gaussian distributed, 
we formulate a likelihood ansatz using these as input parameters to estimate the full 
sky power spectrum from a patch on the sky. Since this likelihood can be calculated 
quickly without having to invert huge matrices, this allows for fast power spectrum 
estimation. By using the pseudo power spectrum from several patches on the sky 
together, the full sky power spectrum can be estimated from full-sky or nearly full-sky 
observations. 

Key words: methods: data analysis-methods: statistical-techniques: image processing 
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1 INTRODUCTION 

The Cosmic Microwave Background (CMB) is one of our most important sources of information about the early universe 
(Bond 1995; Jungman et al. 1996) ; Hu, Sugiyama & Silk 1997; Durrer 2001). The pattern of the temperature fluctuations in 
the CMB contains information about a number of cosmological parameters. If the temperature fluctuations are Gaussian as 
predicted by most models of the early universe, all this information is stored in the angular power spectrum coefficients Ce. For 
this reason, several experiments have been conducted to measure the CMB power spectrum. The COBE satellite discovered 



the fluctuations in 1992 ( 


G. F. Smoot et al. 1992), and since then several ground based and balloon borne experiments ( 


De 


Bernardis et al. 200C; 


Hanany et al. 2000 




Netterfield et al. 2001 




Lee et al. 2001 




Halverson et al. 2001, Pryke et al. 2001) 



have been made to study the CMB at an ever increasing resolution. As the amount of CMB data from these experiments is 
rapidly growing, the task of extracting the power spectrum from the data is getting harder. 



Analysing the CMB data from a given experiment consists of several steps as the data consists of several components not 
belonging to the CMB (Maino et al. 2001; Stolyarov et al. 2001). In this paper, we will concentrate on extracting the power 
spectrum from a CMB map with foregrounds removed. The standard method of extracting the power spectrum from a sky 
map is the method of maximum likelihood. This method gives the smallest error bars on the power spectrum estimates, but 
has the drawback that the number of operations needed to perform the estimation, scales as iVp ix , where iV p j x is the number 
of pixels in the map. For experiments with high resolution the number of pixels can be up to several million and this method 
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becomes infeasible using current computers (Borrill 1999) 



In (Oh, Spergel & Hinshaw 1999), it is shown how the likelihood analysis can be speeded up to scale as -/V pix with as- 



sumptions about azimuthal symmetry and uncorr elated noise. Another meth od for large azimuthally symmetric parts of 
the sky with uncorrelated noise was presented in (Wandelt, Gorski & Hivon 2000). The likelihood problem can also be solved 



exact in iVpj x o perations with correlated noise for special scanning strategies as demonstrated in (Wandelt 2000 



Ha lsen 2001). In (Bond 1995; Bond, Jaffe & Knox 2000; Bartlett et al. 2000) it is shown how one can approximate the 



Wandelt 



likelihood to speed up the calculations, but still an N^ ix operation is needed. This has led people to find other estimators 



than the maximum likelihood estimator to extract the power spectrum. In (Tegmark, Taylor & Heavens 1997) an optimal 



estimator was found but the calculation scales as A^p ix times a huge prefactor. Recently some near optimal estimators have 
been found which can be calculated in JVp ix operations (Pore, Knox & Peel 2001; Szapudi et al. 2000; Hivon et al. 2002) The 



data fro m the BOOMERA NG (De Bernardis et al. 2000 



Netterfield et al. 2001) experiment was analysed using the MASTER 



method ( Hivon ct al. 2002 ). In this method, the power spectrum was extracted by a quadrat ic estimator base d on the pseudo 
power spectrum (the power spectrum on the cut sky). A similar method was suggested by ( Balbi et al. 2002 ) for the Planck 
s urveyor. Here we propose to us e the pseudo power spectrum {Ce) for likelihood estimation. This principle was also used in 
(Wandelt, Gorski & Hivon 200C) but then for large sky coverage so that the correlations between the Cg coefficients could be 
neglected. 



In this paper, we study the effect of Gabor transforms on the sphere. Gabor transforms, or windowed Fourier transforms 
are just Fourier transforms where the function f(x) to be Fourier transformed is multiplied with a Gabor window W(x) ( |Gaboi 
19460 In the discrete case f{xi) can be a data stream. If parts of the data stream are of poor quality or is missing, this can 
be represented as W(xi)f(xi) where the window W is zero where there are missing parts. The window can also be formed so 
that it smoothes the edges close to the missing parts and in this way avoid ringing in the Fourier spectrum. 



We will study the effect of Gabor transforms on the sphere and use it for fast CMB power spectrum estimation. The 
Gabor transform in this context is just the multiplication of the CMB sky with a window function before using the spherical 
harmonic transform to get the Gabor transform coefficients in this case called the pseudo power spectrum. The window can 
be a top-hat to take out certain parts of the sky in the case of limited sky coverage. Another window can be a Gaussian 
Gabor window for smoothing the transition between the observed and unobserved area of the sky. The Gabor window can 
also be designed in such a way as to increase signal-to-noise by giving pixels with high signal-to-noise higher significance in 
the analysis. The use of the windowed Fourier transform was already studied in (Hobson & Magueijo 1996) in the flat-sky 
approximation. We show that some of their results are also valid on the sphere. 



In the standard likelihood approach of power spectrum estimation, the pixels on the CMB sky or the spherical harmonic 
coefficients ai m are used as elements in the data vector in which case the correlation matrix will have dimensions of the order 
Af p ix x Afpi x . A matrix of this size can not be inverted in a reasonable amount of time with current computers. We propose 
to use the pseudo power spectrum coefficients Gi as elements of the data vector in the likelihood. In this case the size of the 
correlation matrix will at most be Z max X Z max which can be inverted in a few seconds. The most time consuming part is the 
calculation of the elements of the correlation matrix of pseudo-CV 

In Section (^ we will first describe the one dimensional Gabor transform and then define the Gabor transform on the 
sphere. We will define the pseudo power spectrum which is just the Gabor coefficients on the CMB sky. The kernel relating 
the full sky power spectrum and the pseudo power spectrum for a Gaussian and top-hat Gabor window will be discussed. 
Then in Section (^) we will use the pseudo power spectrum as input values to a maximum likelihood estimation of the full sky 
power spectrum. The probability distribution of the pseudo power spectrum coefficients will be assumed Gaussian and we will 
show that this is a good approximation at high multipoles (£ > 100). Some examples of likelihood estimations of the power 
spectrum with different noise patterns will be shown. In Section (Q) two extensions of the method will be discussed. First the 
use of the pseudo power spectrum from different Gabor windows centred at different points on the sphere simultaneously is 
demonstrated. In this way full-sky or nearly full-sky observations can be analysed. The second extension of the method is the 
use of Monte Carlo simulations to obtain noise properties in the case where this is faster than using the analytic expression 
or where the noise is correlated. Finally in Section (@) the results and further extensions are discussed. 



2 THE GABOR TRANSFORMATION AND THE TEMPERATURE POWER SPECTRUM 

In this section we will first describe the Gabor transform for functions on a one dimensional line. Then we extend the 
formalism to functions on the sphere and the properties of the Gabor transform coefficients on the CMB sky, the pseudo-CV, 
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are discussed. As most CMB experiment will not be able to observe the full sky, it is important to study the properties of the 
power spectrum on the sky apodised with a window function. As we will show later, the best way to construct the window is 
not always to set it to 1 in the observed area and to in the non-observed area of the sky. For this reason we will study the 
Gabor transform for windows with different profiles. On the cut sky the pseudo power spectrum coefficients will get coupled 



(Wandelt, Gorski & Hivon 2000; Hivon et al. 2002). We will study how strong this coupling is for different window sizes and 



for different windows. We will in particular study the top-hat and the Gaussian windows. The top-hat window is important, 
as it is the window which preserves most of the information in the observed data set. The Gaussian window smoothes of the 
edges between the observed and unobserved areas of the sky and in this way cuts off long range correlations between pseudo 
CV 



2.1 The one dimensional Gabor transform 

For a data set dj with N elements, the normal Fourier transform is defined as, 

d h = J2d^ k/N . (1) 

3 

A tilde on d shows that these are the Fourier coefficients. The inverse transform is then, 

k 

Sometimes it is useful to study the spectrum of just a part of the data set. This could be if some parts are of poor quality 
or the spectrum is changing along the data set. In this case, one can multiply the data set with a function, removing the 
unwanted parts and taking out a segment to be studied. The function can be a step function cutting out the segment to study 
with sharp edges or a function which smoothes the edges of the segment to avoid ringing (typically a Gaussian). 



The Fourier transform with such a multiplication was studied by Gabor ( Gabor 1946 ) and is called the Gabor Transform. 
It is defined for a segment centred at j = M and with wavenumber k as, 

dkM = djGj^M e l27rjfc/jv . (3) 

3 

Here G 3 -m is the Gabor window, the function to multiply the data set with. The transform is similar to the Wavelet transform. 
The difference is that the window function in the Wavelet transform is frequency dependent so that the size of the segment 
is changing with frequency. 

Analogously to the Fourier transform, there is also an inverse Gabor transform. To recover the whole data set from a 
Gabor transform, one needs the Fourier coefficients taken with several windows G 3 -m being centred at different points M. 
This means that the data set has to be split into several segments. The centre of each segment is set to M = mK where K 
determines the density of segments and m is an integer specifying the segment number. One then has for the inverse transform 

d 3 = d km g km . (4) 

m k 

Due to the non-orthogonality of the Gabor transform, the dual Gabor window gkm is not trivial to find, but several techniques 
have been developed for calculating this dual window (e.g. ( [gtrohmer 1997 ) and references therein). 



In this paper we will study the Gabor transform on the sphere and apply it to CMB analysis. We will take out a disc on 
the CMB sky, using either top-hat or Gaussian apodisation and then derive the pseudo power spectrum Ce on the apodised 
sky. The Ce will be used for likelihood estimation of the underlying full sky power spectrum. We also show how several discs 
(segments) centred at different points can be combined to yield the full sky power spectrum. 

2.2 Gabor transform on the sphere 

We start by defining the Ce for a Gabor window G(n) as, 

/=< \ a tm a lm ,r\ 
m 

where 

~a lm = / dnT(n)G(n)y/ m (n). (6) 
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Here T(n) is the observed temperature in the direction of the unit vector n, Yf m (n) is the spherical Harmonic function and 
G(n) is the Gabor window. We now find an expression for the expectation value of Ce- 

We will here use a Gabor window which is azimuthally symmetric about a point no on the sphere, so that the window 
is only a function of the angular distance from this point on the sphere cos 9 = n ■ no . Then one can write the Legendre 
expansion of the window as, 

2£+l 



G(9) = -=±-gtPt(<x*0) = ^^Y ft „(n)y; m (n ). 

One can also write, 
T(n) = ^ a lm Y lm (n). 

lm 

Inserting these two expressions in equation (^) one gets 

aim = y^a^'m' ^ gt"Y e *, m „(n ) / Y e * m (n)Y e , m ,(n)Y e ,i m //(n)dn 



^ O-i'm' ^ 9l"Yln m „{ho 
I'm 1 I" m" 

I £' £" 



{21 + l)(2£' + l)(2f" + 1) 

47T 



X 



-m m m 



£ £' I" 




(7) 
(8) 

(9) 
(10) 
(11) 



where relation (B3) for Wigner 3j Symbols were used. Using this expression, the relation (a1m a l'm') = Ce&ee'&mm 1 and the 
orthogonality of Wigner symbols (equation Bl), one can write (Ce) as, 



(12) 



With Ce we will always mean (Ce) when we are referring to the full sky power spectrum. In this expression, K(£,£') is the 
Gabor kernel, 

:i ( 2£" + 1) (if! £" x 2 
(4tt) 2 ioO 



K(i,£') = (21' +i)y] 



(13) 



The Legendre coefficients ge, are found by the inverse Legendre transformation, 



ge = 2n 



G(6)P e (cos6)dcosl 



(14) 



where 9c is the cut-off angle where the window goes to zero. One sees from the expression for the kernel, that there is no 
dependency on no. This means that (Ci) is the same, independent on where the Gabor window is centred. In the rest of this 
section we will study the shape of this kernel which couples the Ce on the apodised sphere. 
In Fig. |l| we have plotted the kernel for a Gaussian Gabor window, 



G(9) = e 
G(0) = 



-e 2 /(2a 





9<9 C , 
9>9 C , 



(15) 
(16) 



with 5 and 15 degrees FWHM (corresponding to a — 2.12° and a = 6.38°) and 9c ~ 3<r. One sees that the kernel is centred 
about £ — £' , and falls off rapidly. Fig. (g) shows the same for the corresponding top-hat Gabor windows, 

G(9) = 1 9 < 9 C , (17) 
G(9) = 9 > 6 . (18) 

The top-hat windows are covering the same area on the sky as the corresponding Gaussian windows in Fig. [j] (9c is the same). 
Ones sees that the diagonal is broader for the smaller windows indicating stronger couplings. Another thing to notice is that 
whereas the kernel for the top-hat Gabor window only falls by about 4 orders of magnitude from the diagonal to the far 
off-diagonal elements, the Gaussian Gabor kernel falls by about 8 orders of magnitude (the vertical axis on the four plots are 
the same). The smooth cut-off of the Gaussian Gabor window cuts off long range correlations in spherical harmonic space. 
One of the aims of the first part of this paper is to see how the pseudo power spectrum of a given disc on the sky (top-hat 
window) is affected by the multiplication with a Gaussian Gabor window. For this reason the pseudo spectrum will be studied 
for a top-hat and a Gaussian covering the same area on the sky. We will also study a top-hat window which has the same 
integrated area as the Gaussian window. The cut-off angle 9c = #mt for these windows is given by 
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Figure 1. The logarithm of the kernel K(£,£') describing the connection between the spherical harmonic coefficients Ci on the full sky 
and the corresponding coefficients Ci on the apodised sky via the relation Ci = / ],/ K(£,£')C(i . The figure shows the kernel for a 5° 
and f5° FWHM Gaussian Gabor window with 9q = 3<r (left and right respectively). 




Figure 2. Same as Fig. hi for top-hat Gabor windows covering the same area on the sky as the Gaussian windows. 
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Figure 3. The figures show slices of the kernel K(£,£') connecting the full sky and cut sky spherical harmonic coefficients. The full 
kernels are shown in Figs, [l] and ^ The slices arc taken at I = 200 and i = 500 for the 5 (upper plot) and 15 (lower plot) degree Gaussian 
Gabor window (dotted black line). The solid line is for the corresponding (same area on the sky) top-hat window Wa- The dotted 
coloured line is for the top-hat window W\ having the same integrated area as the Gaussian window. The kernels are here normalised so 
that the peak in the given slice has its maximum at 1. In this way one can easier compare the shape of the kernels. 



re=3a rS=e ita , 

/ G(8)dcos8 = / dcosO. (19) 

J 9=0 J 9=0 

In this section we will be comparing a Gaussian window (called Wo) having 8c = 3<r with a top-hat window (called Wa) 
having the same area on the sky (6c = 3<r) and with a top-hat window (called Wi) having the same integrated area (6c = #int). 

In Fig. |3j we have plotted slices of the kernel at I = 200 and £ = 500 for the 5 and 15 degree FWHM Gaussian Gabor 
windows (dashed line). The solid line is the corresponding kernel (same area on the sky) when using the top-hat Gabor 
window (Wa). One sees that the Gaussian window effectively cuts off long range correlations whereas the top-hat window 
is narrower close to the diagonal. The Gaussian window has larger short range correlations. The coloured lines show the 
slice of the kernel for a top- hat window having the same integrated area as the Gaussian window (Wi). These kernels have 
the s ame widths as th e kernels for the Gaussian windows, but the long range correlations are significantly larger. In ( |Hob 



son fc Magueijo 1996 ) it was shown that in the flat-sky approximation, the long range correlations are significant when the 



window has a sharp cut-off. On the sphere we see that even for a sharp top-hat window the long range correlations are damped. 



Fig. y shows how the width of the kernel gets narrower and the correlations smaller as the Gabor window opens up. The 
four kernels are shown for £ = 500 and the Gaussian windows have 5, 10, 15 and 30 degree FWHM with 8c = 3<r. The same 
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Figure 4. The figures show a slice of the kernel K(£,£') connecting the full sky and cut sky spherical harmonic coefficients. The slices 
are taken at I = 500 for a 5, 10, 15 and 30 degree FWHM Gaussian Gabor window (solid line) with a Oq = 3cr cut-off. The dotted 
line shows the kernel for a top-hat window Wa covering the same area on the sky. The coloured lines which are almost on top of the 
lines for the Gaussian Gabor windows show the kernels for a top-hat window Wi which has the same integrated area as the Gaussian 
windows Wq. The dashed lines which are almost on top of the dotted and solid line (and for this reason not so easily seen in the plot) 
are Gaussian fits to the curves. 



kernels for the top-hat windows Wa (dotted lines) and Wi (coloured line) are plotted on top. Gaussian fits are plotted on top 
of the kernels and show that the kernels are very close to Gaussian functions near the diagonal. 

In Fig. [B| we have plotted the relation between the FWHM width A£ of the kernel and the size A8 of the window for 
Gaussian and top-hat windows. The two curves are very well described by A£ = 220/#fwhm for the Gaussian window (Sfwhm 
in degrees) and A£ = 175/# r adms (# ra dius being the radius of the top-hat window in degrees) for the top-hat window. Clearly 
for a given observed area of the sky, multiplying with a Gaussian will increase the FWHM of the kernel. This is also what 
was seen in Figs ^| and We will see that this results in a lower spectral resolution for the Gaussian window compared to 
the top-hat window. But the lower long range correlations of the Gaussian window makes the shape of the pseudo power 
spectrum closer to that of the full sky power spectrum. 

In Fig. ^, we show the shapes of the Ci for Gaussian and top-hat windows compared to the full sky spectrum. The plots 
which were made using the analytical formula (^) show Ce for a 5 and 15 degree FWHM Gaussian Gabor window (solid 
line) cut at 3a. The corresponding spectrum for the top-hat Gabor window Wa is shown as dotted lines and for the top-hat 
window Wi as coloured lines. The spectra are normalised in such a way that they can be compared to the full sky power 
spectrum (dashed line). For the 5° FWHM window one can still distinguish the four lines. At this window size the pseudo 
spectra are very similar to the full sky spectra but with small deviations depending on the shapes of the kernel and the shape 
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Figure 5. The figure shows the uncertainty relation A£A9 = constant for a Gabor transform on the sphere. The solid line shows 
the width A£ of the Gabor kernel K (£,£') connecting the full sky and the cut sky power spectra when applying a Gaussian Gabor 
window with a cut 0q = 3o\ The FWHM is shown on the lower abscissa. The dashed line shows the width of the kernel for a top-hat 
window. The full radius of the top- hat window is shown on the upper abscissa. The curves are well descr ibed by A£ = 22O/0fwhm and 
A£ = 175 /0 ra dius f° r the Gaussian and top-hat windows respectively. As will be discussed in Section 3.2, this relation also describes the 
width of the correlation matrix of Ci . The width of the correlation matrix using a Gaussian window follows the relation in this plot times 
a factor 1.42. For the top-hat window, the width of the correlation matrix is the same as for the kernel shown in this figure. 



of the power spectrum. In this case the spectrum for the Gaussian window seems to be smaller at the peaks and larger at the 
troughs whereas the spectrum for the top-hat windows is always larger. 

For the 15° FWHM windows the pseudo spectrum using the Gaussian Gabor window are on top of the full sky power 
spectrum. For the top-hat windows it is still possible to distinguish the pseudo spectrum from the full sky power spectrum 
although the lines are still very close. The plot implies that the Ct could be good estimators of the underlying full sky Ce 
provided that the window is big enough. Note that for small windows, the Gaussian Gabor window makes the pseudo spectrum 
a better estimator than the pseudo-spectrum for a top-hat window at higher multipoles. In (Hobson & Magueijo 1996) it was 
shown in the flat-sky approximation that the pseudo power spectrum for small fields get significantly distorted, but that the 
shape of the pseudo spectrum gradually approaches the shape of the full sky power spectrum when the window gets larger. 
We see here that the same results applies to the treatment on the sphere. In the flat-sky approximation however, the error in 
estimating the average power spectrum from the pseudo power spectrum from one single realisation is bigger due to the long 
range correlations of the pseudo power spectrum coefficients in the flat-sky approximation. 



One feature which is very prominent is the additional peak at low I for the Gaussian window. The reason for this peak 
comes from the fact that the diagonal in the Gaussian kernel is broader than in the top-hat kernel for a top-hat window with 
corresponding area. For the low multipoles the power spectrum is dropping rapidly because of the Sachs- Wolfe effect and the 
lowest multipole Ct are much bigger than the Ct for higher multipoles. Since the Gaussian kernel is broad, the Ce at low 
multipoles will pick up more from the Ce at lower multipoles than the narrower top-hat kernel (see Fig. |^). These low multipole 
Ce have very high values compared to the higher multipole Ce and for that reason the Ci for the Gaussian window will get 
a higher value. This is illustrated in Fig. F] where a slice of the kernel at £ = 50 is shown for the 5° FWHM Gaussian Gabor 
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Figure 6. The windowed power spectra Of, for a 5 and 15 degree FWHM Gaussian Gabor Gabor window Wg cut at Bq = 3cr (solid 
line) and for a top-hat window Wa covering the same area on the sky (dotted line). The spectrum for the top-hat window W\ for which 
the integrated area of the window corresponds to the Gaussian is shown as a coloured line. All spectra are normalised in such a way that 
they can be compared directly with the full sky spectrum which is shown on each plot as a dashed line. Only in the first plot are all four 
lines visible. In the last plot, the full sky spectrum and the Gaussian pseudo spectrum (dashed and solid line) are only distinguishable 
in the first few multipoles. 

window (solid line) and the corresponding top-hat Wa (dashed line) normalised to one at the peak. The dotted line shows 
a typical power spectrum. Clearly the Gaussian kernel will pick up more of the high value Ct at low multipoles. Note that 
for the pseudo spectrum for the top-hat window Wi where the integral of the top-hat window corresponds to the integrated 
Gaussian window (coloured line) , there is also a peak at low multipole. The reason is that the width of the kernel is the same 
as for the Gaussian. 

In Fig. |i] we show the pseudo power spectra for a particular realisation using a 15 degree FWHM Gaussian window (upper 
plot) and a top-hat window Wa (lower plot). The pseudo spectra are compared to the average full sky spectra shown as a 
dashed line. The dark shaded area shows the expected la cosmic and sample variance on the pseudo spectra taken from the 
formulae to be developed in the next sections. The lighter shaded area shows only cosmic variance. Note that the pseudo 
spectrum for the Gaussian window is smoother than the pseudo spectrum for the top-hat window. This shows the lower 
spectral resolution of the Gaussian window due to the broader kernel. 



3 LIKELIHOOD ANALYSIS 

In this section we will show how the pseudo power spectrum can be used as input to a likelihood analysis for estimating the 
full sky power spectrum from an observed disc on the sky multiplied with a Gabor window. We will in this section concentrate 
on a Gaussian Gabor Wg window, but the formalism is valid for any azimuthally symmetric Gabor window. We start by 
showing that the pseudo-CV are close to Gaussian distributed which allows for a Gaussian form of the likelihood function. 
Then we show how the correlation matrices can be calculated quickly for an axisymmetric patch on the sky with uncorrelated 
noise. The extension to the more realistic situation with correlated noise and non-axisymmetric sky patches will be made in 
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Figure 7. The figure shows a slice of the kernel K(£,£') connecting the full sky and cut sky spherical harmonic coefficients. The slice is 
taken at £ = 50 for a 5 degree FWHM Gaussian Gabor window (solid line) and a corresponding top-hat window W\ (dashed line). The 
kernels are normalised to one at the peak. A typical power spectrum normalised to one at the quadrupole is plotted as a dotted line. 
The figure aims at explaining the extra peak in the pseudo power spectrum at low multipoles for the Gaussian Gabor window shown in 
Fig. |[ 



the next section. We will show the results of power spectrum estimations with different noise profiles and window sizes. We 
will also show that the use of a window different from the top-hat window can be advantageous for some noise profiles, even 
if the window has a lower spectral resolution than the top-hat window. 



3.1 The form of the likelihood function 

To know the form of the likelihood function, one needs to know the probability distribution of Ci. In Figs ^ and jw] we show 
the probability distribution from 10000 simulations with a 5° and 15° FWHM Gaussian Gabor window respectively. The 
dashed line shows a Gaussian with mean value and standard deviation found from the formulae given in the previous and 
next section. One can see that the probability distribution is slightly skewed for low £, but for high £ it seems to be very 
well approximated by a Gaussian. Also the small window shows more deviations from a Gaussian than the bigger window. In 
Fig. [H] we show that this result is not limited to the Gaussian window. The plot shows the probability distribution from a 
simulation with a top-hat Gabor window Wa covering the same area on the sky as the 15° FWHM Gaussian window. Also 
for this window the probability distribution is close to Gaussian. 

From the above plots it seems to be reasonable to approximate the likelihood function with a Gaussian provided the 
window is big enough and multipoles at high enough £ values are used, 

id-M^d 

C= —. (20) 
V2tt detM 

Omitting all constant terms and factors, the log-likelihood can then be written: 

L = d T M *d + In dot M. (21) 

Here d is the data vector vector which contains the observed Ce for the set of sample I- values t\. The data is taken from 
the observed windowed sky in the following way: 
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Figure 8. One realisation of the windowed power spectra. The upper plot shows a realisation of a pseudo power spectrum using a 15 
degree FWHM Gaussian Gabor window. The pseudo spectrum is normalised in such a way that it can be compared directly to the 
full sky spectrum which average is shown as a dotted line. The lower plot shows the same realisation using a corresponding top-hat 
window. The light shaded area shows lcr cosmic variance around the full sky average spectrum. The darker area shows lcr cosmic and 
sampling variance taken from the theoretical formula. On the lower plot, the pseudo spectrum with a top-hat window Wa having the 
same integrated area as in the upper plot is shown as a dashed line. 



d i = C h -{Ci i ). 



(22) 



The matrix M is the covariance between pseudo-Ci which elements are given by: 

Ma = {C k C tj ) - {C h ) (Ce . } . (23) 

In Appendix (|e|) and ([f]) we have found expressions which enable fast evaluations of di and Mij for signal and noise. 
These major results are are given in equations (E14), (F1J) and (F21) and the recursion which enables fast calculation of 



these expressions is given in equation (ZT5). In the derivations of the expressions for Mij, the rotational invariance of the 



(not-averaged) Ce shown in Appendix (|d|) was used. Because of this rotational invariance, all derivations can be done with the 
Gabor window centred at the north pole. In Fig. [l^ we used the full formula (equation E14) from Appendix (|ij) to calculate 
the signal correlation matrix for a typical power spectrum with a 15° FWHM Gaussian Gabor window (note that in the figure, 
the correlation matrix is normalised with the pseudo power spectrum). The correlation between Ce of different multipoles is 
falling of rapidly with the distance from the diagonal. Only exception being the small 'wall' at low multipoles which again 
comes from the coupling to the smallest multipoles which have very high values. 



3.2 Likelihood estimation and results 

Because of the limited information content in one patch of the sky one can not estimate the full sky Ce for all multipoles £. For 
this reason the full sky power spectrum has to be estimated in JV bln bins. Also the algorithm to minimise the log-likelihood 
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Figure 9. The probability distribution of Cp taken from 10000 simulations with a 5° FWHM Gaussian Gabor window truncated at 
9q = 3cr. The variable x is given as x = (Cp — (Cp))/\J {(Cp — (Cp)) 2 ). The dashed line is a Gaussian with the theoretical mean and 
standard deviation of the Cp. The plot shows the Cp distribution for £ = 50, I = 200, i = 500, and I = 800. The probabilities are 
normalised such that the integral over a; is 1. 



needs the different numbers to be estimated to be of roughly the same order of magnitude. For this reason we estimate for 
some parameters Db which for bin b is defined as 



Cp 



D b 



i{i+iy 



i b <£< 4+i 



(24) 



where lb is the first multipole in bin b. 

Since the Cp are coupled, one can not use all multipoles in the data vector, the covariance matrix would in this case 
become singular. One has to choose a number N nl of multipoles li for which one finds di. How many multipoles to use 
depends on how tight the Cp are coupled which depends on the width Aikern of the kernel (Fig. ^ or the width A£ COT of 
the correlation matrix. The width of the correlation matrix (normalised with the psuedo power spectrum) varies with window 
size in the same way as the width of the kernel varies with window size. In fact for the top-hat window these two widths are 
the same and for the Gaussian window we found that A£ COI w 1.42A£kem- The optimal number of N ln to use seems to be 
iV ln ~ 3/2 ^ max /A^kcrn- To use a lower N' n increases the error bars on the estimates and a higher N ln does not improve the 
estimates. One can at most fit for as many Cps as the number of Cp (N ln ) one has used in the analysis. So one needs to find 
a number iV bm < N' n of bin values Db from which one can construct the full sky power spectrum Cp. 

In Appendix (jEh and (pj) we found that the full correlation matrix can be written as 



Mi. 



Ml + M£ + Mi 



(25) 
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Figure 10. Same as Fig. bl but for a 15° FWHM Gaussian Gabor window. 



where Mfj is the noise correlation matrix which has to be precomputed for a specific noise model (analytically or by Monte 



Carlo as will be shown in Section (|4| 
Mfj = J2J2 DbDb ' x( - b > b '' i 'fi' 



The signal and signal-noise cross correlation matrices are on the form 



(26) 
(27) 



where the x-functions can be precomputed using formulae (E16) and (F26) 



We will now describe some test simulations to show how the method works. As a first test, we used the same model as 

0.7, Q, h h 2 = 0.03 and n s 



was used in (Hivon et al. 2002), with fi 



1, Ov 



0.975. These are the parameters from 
the combined Maxima-Boomerang analysis (Jaffe et al. 2001). We used a circular patch with 15.5° radius covering the same 



fraction of the sky as in (Hivon et al. 2002). Using HEALPix we simulated a CMB sky using a standard CDM power spectrum 

512 in HEALPix language). We smoothed the map with a 10' beam and added 

12° was used with a cut-off 6c = 3a. 



with Z max = 1024 and a 7' pixel size (iV a ide 

non-correlated non-uniform noise to it. Here a Gaussian Gabor window with FWHM - 

For the likelihood estimation, we had 7V bin = 20 full sky Ce bins and iV in = 100 Ct values between I = 2 and I = 960. In Fig. 
|l3| one can see the result. The shaded areas are the expected la variance with and without noise. These were found from the 
theoretical formula 

AC 5 = J—(C b + N b ), (28) 
where Ni, is the noise 'on the sky', v b is the effective number of degrees of freedom given as 

(24 + l)A£/ sky ^, (29) 
and the wt factors are dependent on the window according to 
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Figure 11. Same as Fig. [Io| but for a top- hat Gabor window Wa covering the same area on the sky. 



sky t 



1 

4% 



dhG 1 (n) 



(30) 



This formula is exact for a uniform noise model (Hivon et al. 2002) and is similar to the one used in most publications. 
It is in this case a very good approximation even with non-uniform noise. In the next example however we will show that the 
formula has to be used with care. In the figure, the error bars on the estimates are taken from the Fisher matrix and the 
signal-to-noise ratio S/N — 1 at i = 575. 

In Fig. O, we have plotted the average of 1000 such simulations, with different noise and sky realisations. From the plot, 
the method seems to give an unbiased estimate of the power spectrum bins D^. For the lowest multipoles the estimates are 
slightly lower than the binned input spectrum. This is a result of the slightly skewed probability distribution of Ci for small 
windows at these low multipoles (see Figs ^ and |l^). The probability that the Ce at lower multipoles have a value lower 
than the average (Ce) is high and the assumption about a Gaussian distribution about this average leads the estimates to be 
lower. When a bigger area of the sky is available such that several patches can be analysed jointly to give the full sky power 
spectrum, this bias seems to disappear. This will be shown in Section ( |4.l[ ). 

In this example one can see that the la error bars from Monte Carlo coincide very well with the theoretical error shown 



as shaded areas from the formula in (Hivon et al. 2002). Note that the error bars on the higher I are smaller than in (Hivon 



et al. 2002 ) because the noise model used in that paper was not white. Also they took into account errors due to map making 
which is not considered here. 



As a next test, we used a simulation with the same resolution and beam size. The power spectrum was this time a 
standard CDM power spectrum. We used an axisymmetric noise model with noise increasing from the centre and outwards 
to the edges (see Fig. |l6| ). This is the kind of noise model which could be expected from an experiment scanning on rings, 
with the rings crossing in the centre. We now use a circular patch with 18.5° radius and a FWHM — 15° Gaussian Gabor 
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Figure 12. The figure shows the correlation matrix M^i between pseudo spectrum coefficients normalised with the pseudo spectrum 
((CjCj,) - (Cj)(Cj,))/((Cj)(Cj,)) for a 15 degree FWHM Gaussian Gabor window. A standard CDM power spectrum was used to 
produce this matrix. 



window cut at 6q = 3a. An interesting point now is that the Gabor window is decreasing from the centre and outwards, 
which is opposite of the noise pattern. This gives the pixels with low noise high significant in the analysis and the pixels 
with high noise low significance. One sees from the expressions for the signal and noise pseudo power spectra that the Gabor 
window will work differently on both. This means that S/N is different depending on the Gabor window. For this case, we 
have plotted the average pseudo power spectrum for signal and noise separately in Fig. |l^. This shows the described effect. 
The S/N ratio is much higher for the Gaussian Gabor window in this case, favouring the use of this window for the analysis. 

For this example we used again N bln — 20 and N m = 100. The result is shown in Fig. In Fig. |l^ the average over 
5000 simulations and estimations is shown. One can see that the estimate also does well beyond £ = 520 which is where the 
effective S/N = 1. The method is still unbiased. The error bars in the part where noise dominates are here lower than the 
theoretical approximation (^8|) shown as the dark shaded area. The dashed lines show the theoretical la variance taken from 
the inverse Fisher matrix which here gives a very good agreement with Monte Carlo. 

In Fig. [l^ we show the average (over 5000 estimations) of the correlation between the estimates Dt between different bins. 
The figure shows that the correlations between estimates are low and in fact in each line all off-diagonal elements are more 
than an order of magnitude lower than the diagonal element of that line. In Fig. ^o] we show that the probability distribution 
of the estimates in Fig. ^8| is almost Gaussian. 

To test the method at higher multipoles we also did one estimation up to multipole t — 2048. We used HEALPix resolution 
A'side = 1024 and simulated a sky with a 8' Gaussian beam and added noise from a strongly varying non-uniform noise model. 



Both the beam and noise level were adjusted according to the specifications for the Planck HFI 143GHz detector (Bersanelli 
et al. 1996). We used again a 15° FWHM Gaussian Gabor window cut 3a away from the centre. In the estimation we used 
N bin = 40 bins and N in = 200 input Ci between 1 = 1 and I = 2048. The average of 100 such simulations is shown in Fig. 
|il| . Each complete likelihood estimation (which includes a total of about 25 likelihood evaluations) took about 8 minutes on 
a single processor on a 500MHz DEC Alpha work station. 

In Fig. we have plotted the average of 300 estimations where the input data was the Ce from simulations with a 
fixed CMB realisation and varying noise realisation. The dotted line shows the N" 1 Cis (without noise) used as input to the 
likelihood. The histogram is as before the input pseudo spectrum without noise binned in N h ' n bins. This means that each 
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Figure 13. The analysis of an input model with f^total = li = 0.7, Cl^h = 0.03 and n s = 0.975. We used a non-uniform white noise 
model with S/N = 1 at £ = 575. The dotted line is the input average full sky power spectrum and the histogram shows the binned pseudo 
power spectrum for this realisation (without noise). We used 7V bm = 20 bins and 7V m = 100 input sample points to the likelihood. The 
shaded areas around the binned average full sky spectrum (which is not plotted) are the theoretical variance with and without noise. The 



bright shaded area shows cosmic and samp 
due to noise was calculated using formula ( 2 



c variance and the dark shaded area also has variance due to noise included. The variance 
for uniform noise. The la error bars on the estimates are taken from the inverse Fisher 



matrix. The solid line increasing from the left to the right is the noise power spectrum. 



histogram line shows the average of the dotted line over the bin. One can see that the estimated power spectrum is partly 
following the N ln input Ce and partly the binned power spectrum. 

Finally, we made a comparison between a top-hat window and a Gaussian Gabor window. In this case we used uniform 
noise, so that the Gaussian and top-hat Gabor windows have the same S/N ratio which we set to 1 at I = 520. We used a 
disc with 18° radius, 7V in = 200 and iV bin = 20. In Fig. M one can see the result. The lower plot shows the estimates with the 
Gaussian Gabor window (15° FWHM) and the upper with the top-hat window. The Gaussian window is suppressing parts 
of the data and for this reason gets a higher sample variance than the top-hat. This effect is seen in the plot. Clearly when 
no noise weighting is required the top-hat window seems to be the preferred window (which was also discussed in (Hivon et 
al. 20(||)). This chapter has been concentrating on the Gaussian window to study power spectrum estimation in the presence 
of a window different from a top-hat. It has been shown that a different window can be advantageous when the noise is not 
uniformly distributed as one can then give data with different quality different significance. 



4 EXTENSIONS OF THE METHOD 

A real CMB experiment usually does not observe an axisymmetric patch of the sky. Usually the noise between pixels is also 
correlated. In order to take these two issues into account we will discuss two extensions of the method. The formalism for 
the extensions are worked out and some simple examples are shown. Further investigations of these extensions are left for a 
future paper, where the analysis of MAP and Planck data will be discussed. To be able to analyse non-axisymmetric parts 
of the sky, we propose to split the area up into several axisymmetric pieces and use the pseudo- Ce from all these patches in 
the data vector of the likelihood and in this way analyse all patches jointly. We show that if the patches are not overlapping, 
the correlation between pseudo-CV from different patches is so weak that it can be neglected. To deal with correlated noise, 
we propose to use Monte Carlo simulations to find the noise correlation matrix. We demonstrate that for uncorrelated noise, 
one needs a few thousand simulations in order for the error bars on the Ce estimates not to get larger than when using the 
analytic expression. 
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Figure 14. Same as in Fig. [13] but this is an average over 1000 simulations and estimations. The histogram is now the binned average 
full sky power spectrum. The error bars on this plot are the lcr variances taken from Monte Carlo. 



4.1 Multiple patches 

It has been shown how one can do power spectrum estimation on one axisymmetric patch on the sky. The next question that 
arises is what to do when the observed area on the sky is not axisymmetric. In this case one can split the area into several 
axisymmetric pieces and use the Ct from each piece. Then the Ct from all the patches are used together in the likelihood 
maximisation. The first thing to check before embarking on this idea is the correlation between Cts in different patches. In 
Appendix OT) the analytical formula for the correlation matrix describing the correlations between Ct for different patches was 



derived (equation H6 and (H10|). With these expressions we can check how the correlations decrease as the distance between 



the two patches increase. 



After the expression (H6) was tested with Monte Carlo simulations, we computed the correlations between Ct for two 
patches A and B where we varied the distance 9 between the centres of A and B. We used a standard CDM power spectrum 
and both patches A and B had a radius of 18° apodised with a 15° FWHM Gaussian Gabor window. In Fig. ^ we have 
plotted the diagonal of the normalised correlation matrix ((C^Cf) — (Ct) 2 ) / {Ct) 2 ■ The angels 9 we used were 6°, 12°, 24°, 
30° , 36° and 180° . One sees clearly how the correlations drop with the distance. In the two last cases there were no common 
pixels in the patches. As one could expect, the correlations for the largest angels (the first few multipoles) do not drop that fast. 

In Fig. we have plotted two slices of the correlation matrix of Ct for a single patch at I = 400 and I = 800. On the 
top we plotted the diagonals of the correlation matrices for separation angle 9 = 30°, 9 = 36° and 9 = 180°. One sees that 
for the case where the patches do not have overlapping pixels, the whole diagonals have the same level as the far-off-diagonal 
elements in the 8 = 0° matrix. When doing power spectrum estimation on one patch, the result did not change significantly 
when these far-off-diagonal elements were set to zero. For this reason one expects that when analysing several patches which 
do not overlap, simultaneously, the correlations between non-overlapping patches do not need to be taken into account. Note 
however that for the 8 = 30° which means that there are only a few overlapping pixels, the approximation will not be that 
good as the level is orders of magnitude above the far-off-diagonals of the 9 = 0° matrix. Another thing to note is that for 
the lowest multipoles, the correlation between patches is still high but we will also assume this part to be zero and attempt 
a joint analysis of non-overlapping patches. 

The full correlation matrices for and 30 degree separation are shown in Fig. The figures show how the diagonal is 
dropping relative to the far off-diagonal elements. 
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Figure 15. The plots show average signal and noise pseudo power spectra plotted separately. The spectra are normalised so that they 
can be compared to the full sky power spectrum. The solid and dashed curves which almost fall together are the signal pseudo power 
spectra for a 15 degree FWHM Gaussian Gabor window Wq and a corresponding top-hat window Wj± respectively. In the upper plot 
the noise model shown in Fig. |l6| was used. This noise model is increasing from the north pole and down to the edges of the patch. This 
is opposite of the Gaussian window and for this reason the Gaussian window gets higher signal to noise ratio. The solid horizontal line in 
the upper plot shows the noise pseudo power spectrum for the Gaussian window and the dashed horizontal line shows the noise pseudo 
power spectrum for the top-hat window. In the lower plot a uniform noise model was used so that the noise pseudo power spectra fall 
together and are shown as a solid vertical line. The figure shows how a Gabor window different from a top-hat can be used to increase 
signal to noise. 



In Fig. |27| the full correlation matrices for 36 and 180 degree separation is shown. For 36 degree separation one can see 
that the diagonal has almost disappeared with respect to the rest of the matrix whereas for 180 degree the diagonal has 
vanished completely. But the 'wall' at low multipoles remains. 

In Fig. |2^, we did a separate Ce estimation on 146 non-overlapping patches with radius 18° apodised with a 15° FWHM 
Gaussian Gabor window. The patches where uniformly distributed over the sphere and uniform noise was added to the whole 
map. The figure shows the average of the 146 Ce estimates. One can see that the estimate seems to be approaching the full 
sky power spectrum even at small multipoles. 

Finally we made a joint analysis of all the 146 patches. The idea was to extend the data vector in the likelihood so that 
it contained the N ln Ce from all the 146 patches. The data vector can then be written as d = {di, d2, di46} where d; now 
denotes the whole data vector for patch number i. From the results above it seems to be a good approximation to assume that 
the correlation between Ce from different patches is zero so that the correlation matrix will be block diagonal. Each block is 
then the correlation matrix for each individual patch. The log-likelihood can then simply be written as 

146 146 

L = y^d? , Mr 1 d i + y]lndetMi, (31) 

i=l i=l 

where Mi is the correlation matrix for patch number i. In Fig. [29] the result of this joint analysis is shown. One can see that 
the full sky power spectrum is well within the two sigma error bars of the estimates. 



Noisemap 
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Figure 16. The noise map with noise increasing from the north pole and downwards. The figure shows a gnomic projection with the 
north pole in the centre. 



The method of combining patches on the sky for power spectrum analysis will be developed further in a forthcoming 
paper. 



4.2 Monte Carlo simulations of the noise correlations and extension to correlated noise 



The computation of the noise correlation matrix in the general case without any approximations takes ■U Npi^l^^N 111 ) 2 which 
is approximately (N ln ) 2 when iVpi x is large (iVpi x « ^ax)- When N ln is getting large this can be calculated quicker using 



Monte Carlo simulations (as was done in ( |Hivon et al. 2002| )). Finding the Cf from one noise map takes 0(N^( 2 ) operations 
so using Monte Carlo simulations to find the whole noise matrix takes 0(N^ 2 iV s i m ) operation where -/V s j m is the number of 
Monte Carlo simulations needed. So when iV s i m << (N? n ) it will be advantageous using MC if this gives the same result. 

Also when the noise gets correlated, the analytic calculation of {a^ m af, m ) will be very expensive. In this case another 
method for computing {a^ m a^ lrn } will be necessary and Monte Carlo simulations could also prove useful. For a given noise 
model several noise realisations can be made and averaged to yield the noise correlation matrix and the (5f m aJ) ra ) term needed 
in the estimation process. This is of course dependent on a method for fast evaluation of maps with different realisations. 

In figure |3(j, the result of Ct estimation with noise matrix and {a^ m a^i rn ) computed with Monte Carlo is shown. Again a 
standard CDM power spectrum was used with a non-uniform white noise model and a 15° FWHM Gaussian Gabor window. 
In the Ct estimation N ln = 200 Ct were used and N bln = 20 power spectrum bins were estimated. The noise matrices were cal- 
culated using (1) the analytical expression, (2) MC with iV sim = 20000 and (3) MC with iV sim = 1000. In Fig. |l], a slice of the 
correlation matrices for the different cases is shown for I — 500. The dashed line (case (3)) follows the solid line (case (1)) to a 
level of about 10 -2 of the diagonal. The dotted line (case (2)) is roughly correct to about 10 _1 times the value at the diagonal. 



We did 100 estimations for each case and the average result is plotted in Fig. BOl The big dots are the results from case 
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Figure 17. Same as Fig. |l3| but for a standard CDM model. The noise is increasing from the centre and out to the e dge s while the 
Gaussian Gabor window has the opposite effect, giving an increased significance to pixels with less noise. As in Fig. ha the shaded 
areas show the analytically calculated variance using the 'naive' formula for the uniform noise case. The dashed lines show the expected 
variance using the inverse of the Fisher matrix. 




Multipole I 

Figure 18. Same as Fig. |l^ but with 5000 simulations and estimations averaged. The histogram now shows the binned average full sky 
power spectrum. The error bars on the estimates are here the la averages from Monte Carlo. 
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Figure 19. The average correlation matrix N(b,b') = (Df,D b i )/((-D(,) {Dy )) — 1 of the estimates in Fig. [u]. The negative elements are 
coloured. 



(1), the crosses on the right hand side are the results from (2) and the crosses on the left hand side the results from (3). 
The average estimates seem to be consistent, only in the highly noise dominated regime they start to deviate. For case (2), 
the error bars are for some multipoles higher and for some lower than the analytic case. The differences are at most 3%. We 
conclude that using this many simulations, the error bars do not increase significantly over the analytic case. For case (3) the 
error bars are up to 17% higher (and only higher) than the analytic case, ft seems that 1000 simulations was not sufficient to 
keep the same accuracy of the estimates as when using analytic noise matrices. To keep the error bars close to the analytic 
case, it seems that a few thousand simulations are necessary. 



5 CONCLUSION 



In this paper, we propose to use the spherical harmonic transform of the sky apodised by a window function, or Gabor 
transform, as a fast and robust tool to estimate the CMB fluctuations power spectrum. It is known that the coupling between 
modes resulting from the analysis on a cut sky affects the shape of the measured pseudo power spectrum and the statistics 
of the Ci coefficients ((Wandelt, Gorski & Hivon 2000; Hivon ct al. 2002] ) and reference therein). In the case of axisymmetric 



windows we can compute analytically (in about ^ operations) the kernel relating the cut sky power spectrum to the full 
sky one for a Gaussian and top-hat profile we give an analytical relation between the spectral resolution attainable and the 
size of the sky window. Studying windows of different sizes, we show that for windows as small as 36 degrees in radius, the 
measured power spectrum is undiscernable from the true one for I larger than about 50. 

Noting that for large multipoles (t > 100 for windows with radius larger than 36 degree) the statistics of the pseudo-CV 
coefficients measured in Monte Carlo simulations is close to Gaussian, we suggest the use of the pseudo power spectrum as 
input data vector in a likelihood estimation. For the first time, we show how the correlation matrix between the pseudo power 
spectrum coefficients obtained on an axisymmetric window of arbitrary profile can be computed rapidly for any input power 
spectrum, based on a recurrence relation. The computation of the correlation matrix needs a precomputation (independent of 
the power spectrum) of £ma,*N m (N ln ) 2 operations and each calculation of the correlation matrix with a given power spectrum 
takes (iV m ) 2 (jV bln ) 2 operations, where N ln is the number of input pseudo-CV coefficients used, jV bln is the number of estimated 
Ci bins and N m is a window dependent factor (N m w 200 for a Gaussian window and N m « 400 for a top-hat window). 
The noise correlation matrix can also be computed by recurrence. For axisymmetric noise this is very quick (N p i x £ 



operations). For general non-uniform noise this takes some more time (between (A rln ) 2 ^/A r p i x i? max and (A?i n ) 2 yj N p i^£ 



max 
2 

max 
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Figure 20. The probability distribution of the estimates Dt, in Fig. [l8| The variable x is given as x = (D;, — (£);,))/ ((Dt, — (D^)) 2 ). 
The dashed line is a Gaussian with mean and standard deviation taken from Monte Carlo. The plot shows bin estimates centred at 
t = 225, t = 425, t = 525 and i = 725. 



operations dependent on the window profile and number of approximations). For a Gaussian window with a sharply varying 
noise profile and a patch of sky similar in size to the one observed by BOOMERANG (about 2% of the sky) it takes about 
a day on one single 500MHz processor. This is the computationally heaviest part of the method but this has to be done only 
once. The inversion of the correlation matrix, which is the leading problem when doing likelihood analysis, is now overcome, 
as the size of the correlation matrix is so small that inversion is feasible. In the standard likelihood approach, the correlation 
matrix has dimensions iVpi x x iVpi x which needs N^ ix operations to be inverted. In our approach, the size of the correlation 
matrix is 7V m x N ln which in our example N m — 200 is inverted in a few seconds. 

By doing Monte Carlo simulations of different experimental settings, we shown that the likelihood estimator is unbiased. 
The error bars were found using the inverse Fisher matrix and compared to the error bars obtained from Monte Carlo. There 
was an excellent agreement between the two sets of error bars. In (Hivon et al. 2002) it was shown that using a Gaussian 
apodisation suppresses the signal such that the error bars on the estimated power spectrum becomes larger. In this paper we 
have shown that using a different window than the top-hat window can be important for increasing the signal-to-noise ratio 
in data with non-uniform noise. We applied a Gaussian window to an observed disc which had the noise level increasing from 
the centre of the disc and outwards similar to what one can expect around the ecliptic poles in scanning strategies like the 
ones of MAP and Planck. In this case the Gaussian window has a high value in the centre where signal-to-noise per pixel is 
high and a low value close to the more noise dominated edges. We shown that for this noise profile using a Gaussian window 
increased the signal-to-noise ratio significantly over the top-hat window, showing that adapting the window to downweight 
noisy pixels gives better performance than a simple uniform weighting. 

Finally two extensions of the power spectrum estimation method were discussed. First it was shown that for observed 
areas on the sky which are not axisymmetric, one can cover the area by several axisymmetric patches and make a joint analysis 
of the pseudo power spectrum coefficients from all the patches. Each of the patches can have a different window in order to 
optimise signal-to-noise in each patch. This method will be extended in a forthcoming paper where we will discuss the use of 
the method for analysing MAP and Planck data sets. We also shown that the calculation of the noise correlation matrix can 
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Figure 21. Same as Fig. [14] for 100 simulations where the beam and noise level was set according to the specifications of the Planck 
143GHz channel. Again a 15° FWHM Gaussian Gabor window was used. The power spectrum was estimated in 40 bins between £ = 2 
and I = 2048 



be quicker by Monte Carlo simulations if the number of pixels in the observed area is huge (about 10 6 pixels but dependent 
on the window shape). This may also be used in the case of correlated noise. We shown that a few thousand simulations 
are necessary to get the same accuracy in the power spectrum estimates as when using the analytic formula for the noise 
correlation matrix. 



In ( Hansen fc Gorski 2002 ) we show that the power spectrum estimation method presented in this paper can easily be 
extended to polarisation. By extending the data vector in the likelihood to have also the pseudo-C^ from polarisation, one 
can in a similar way estimate for the temperature and polarisation power spectra jointly. 
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Figure 22. The average of 300 estimations where the input Ci were taken from simulations with a fixed CMB realisation but varying 
noise realisations. The dotted line is the N ln input Cf from the CMB realisation without noise. The histogram is the same spectrum 
binned in JV bln bins. The dashed line is the binned average full sky power spectrum from which this realisation was made. The shaded 
areas around the binned full sky spectrum show the variance with (dark) and without (bright) noise. The solid line rising from the left 
to the right is the noise power spectrum. 
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APPENDIX A: ROTATION MATRICES 

A spherical function T(n) is rotated by the operator D(a/3'y) where a/37 are the three Euler angles for rotations (See T.Risbo 
1996) and the inverse rotation is D(— 7 — — a). For the spherical harmonic functions, this operator takes the form, 
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Figure 23. Estimates of C'e using a Gaussian Gabor window (lower plot) and a top-hat window (upper plot). Here we used a uniform 
noise model with S/N = 1 at I = 520. The dotted line shows the average full sky power spectrum and the histogram shows the input 
pseudo power spectrum without noise for this realisation, binned in the same way as the estimates. The bright shaded area shows the 
cosmic and sample variance around the binned average spectrum (not plotted). The dark shaded area has the variance due to noise 
included. The la error bars on the estimates are taken from the inverse Fisher matrix. The solid line increasing from left to right is the 
noise power spectrum. 



Y tm (&')= J2 D e m , m (a^)Y em ,(n), (Al) 
m'=-e 

where D l m , m has the form 

DL. m {afo) = e lm ' a d l m , m (t3)e im \ (A2) 
Here d e m , m (f3) is a real coefficient with the following property: 

d e m > m (P) = d e mrn '{-/3). (A3) 
The D-functions also have the following property: 

(a^) = 

(a2/3272)-Dm"m( a l/3l7l), (A4) 



where (a/37) i s the result of the two consecutive rotations (ai/3i7i) and (02/3272). 
The complex conjugate of the rotation matrices can be written as 

£>mm' = (— l) m+m D(_ m )(_ m /). (A5) 
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Figure 24. The correlation between Cg, between two patches A and B with an angular distance 9 between the centres. A normal CDM 
power spectrum was used and the patches had an 18° radius apodised with a 15° FWHM Gaussian Gabor window. The figure shows 
the diagonal of the normalised correlation matrix {{C^C^) — (C p) 2 ) / (C p) 2 where of course {Ci) = (C*) = (C^) The angles used are 
(from top to bottom on the figure) 0°, 6°, 12°, 18°, 30°, 36° and 180°. 



APPENDIX B: SOME WIGNER SYMBOL RELATIONS 

Throughout the paper, the Wigner 3j Symbols will be used frequently. Here are some relations for these symbols, which are 
used. The orthogonality relation is, 

y,( e e , e ',, )( 1 e , Z> ) = (2/ , +i)- i ww". (bi) 

^ \ m m m I \ m m M J 

mm' 

The Wigner 3j Symbols can be represented as an integral of rotation matrices (see Appendix^)), 

^ dcosed^Dt im ,Di 2m ,Di 3m , = ( , , , . B2 

8n' 1 / 1 i J 2 J 3 \ mi ni2 m? I \ m, m 9 rn-i I 



This expression can be reduced to, 



;(2£+l)(2l' + l)(2£" + l) / t £' I" ,, . , 
dnF, m (n)W(ft)^»m»(ft) = \/^ A 7 A -[ n • ( B3 ) 



4tt V m m' m" / V 



APPENDIX C: RECURRENCE RELATION 

It is important for the precalculations to the likelihood analysis that the calculation of h(l, I' ,m) is fast. For this reason 
a recurrence relation for h(£,£',m) would be helpful. To speed up the calculation of the noise correlation matrix for non- 
axisymmetric noise, it would also help if one had a more general recurrence relation for h(£, £', m, m'). We will now show how 
to find such a recurrence for these objects which we now call A™ e m to simplify notation (and for the notation to comply with 



(Wandelt, Gorski fc Hivon 200C)). The definition is again, 



AP e m = / G(n)Y/ m (n)Y fW (n)dn, (CI) 



where G(n) = G(6, (f>) is a general function and F( m are the spherical harmonics which can be factorised into one part 
dependent on 9 and one dependent on <j> in the following way, 
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Figure 25. slices of the correlation matrices which diagonals arc shown in Fig. The solid lines (thin and thick) show a slice of the 
= 0° correlation matrix at i = 400 and I = 800 respectively. The dashed lines (thin and thick) show the diagonal of the correlation 
matrices for 8 = 36° and 8 = 30° respectively. The dotted line is the diagonal of the 8 = 180° matrix. 



Y em (9,4>) = \ em (cos9)e 1</m \ 
Now writing, 

Ap/™ = / dcos 6\i m (cos 9)\i>i m i (cos 9) I d<^e~ 1 '* ( - m ~ m 'G(#, ( 



d cos 9\ tm (cos 9)X t , m , (cos 9)F m , m (9), 



(C2) 

(C3) 
(C4) 



where F m i m (9) is simply the Fourier transform of the window at each 9. The quantities A™ t m and F m i m (9) are in general 
complex quantities obeying, 



Apt" 1 — {A™™ ) F m i m (9) — (F mm '(9)) 
The Ap e m can be expressed as 



a m m 



v/(2l' + l)(2l + l) l (£>-rn>)\(£-rn)\ > m 
2 \j (£' + m')\{e + m)\ tu ' 



where I™i m is defined as: 

I™e m = / F m > m (x)PP(x)PP (x)dx. 

J a 

The following relation for the Legendre Polynomials will be used: 
m _^-m + l £ + m m 

xPe - 2£ + i Pe+1 + WTi 

We now define the object X™ e m as 



All 1 



F m i m (x)xPe Pgr dx 



(C5) 



(C6) 



(07) 



(08) 



(09) 



Using relation (C8) in this definition, one gets, 
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Figure 26. The figure shows the normalised correlation matrices M(l,£') = ((C^Cf) - {C^){Cf))/({C^)(Cf)) between pseudo 
spectrum coefficients for two patches A and B of 18° radius and with degree (left plot) and 30 degree separation. A Gaussian Gabor 
window with 15 degree FWHM was used. The aim of the plot is to show how correlations between C( from different patches drop when 
the distance between the two patches is about the FWHM of the Gaussian kernel. 




Figure 27. This figure shows the same as Fig. KM but for 36 and 180 degree separation of the patches. 
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Figure 28. The average of 146 individual power spectrum estimations of 146 non-overlapping patches on the same CMB map with 
uniform noise added to it. The patches all had radius 18° degrees apodised with a 15° FWHM Gaussian Gabor window. The histogram 
shows the binned average of the 146 Cg from the different patches without noise. The dotted line is the average full sky power spectrum 
and the shaded areas around the binned full sky power spectrum (not plotted) show the theoretical lcr variance with (dark) and without 
(bright) noise. The solid line rising from the left to the right is the noise power spectrum. 



, / _ m _i_ 1 , 

■A-f'f — — J-t'(t+l) 



+ 



2t + T ' (e ~ x) 



21 + I 

One can also exchange (£,£') and (m,m') to get 

i (' — m' 4- 1 / /' 4- m' t 
' - 21' + 1 W+V 21' + 1 W- 1 ! 
Taking the complex conjugate of the first expression and subtracting the last, one has 

+ m 



{XI 



~t~ 1 / rm' m \* | 



21 + 1 
£' - m + 1 
2£' + 1 



l £(£' + l) 



' / jm m \ 

2l+l { e ' ( - e - 1)) 

jmm 



21' + 1 



Then setting 1=1—1 one gets 
21' - 1 fl-m + l 



jm m 



21- 



- rm' ' m . * rm' m 

-i(e'-i)(t+i) + 21 + 1 



£' + m - 



2P 



- 1 (e>-2)i 



Using equation (C6), one can express this as 
1 



Am m 



!)((£+ 1)2 _ m 2) , 



(2^+l)(2€ + 3) 



(4^' 2 - 1)(<? 2 -m 2 ) 



4£ 2 



,4 



(e'-i)(i-i) 



(2£' + !)((£' -I) 2 



2/ 7 



' Am m 



(C10) 
(Cll) 

(C12) 
(C13) 

(C14) 
(C15) 



which is the final recurrence relation. The A™,™ elements must be provided before the recurrence is started. Then for each 
(m,m'), set £' — m' + 1 and let £ go from £' and upwards, then set £' = m' + 2 and again let I go from £' and upwards. 
Continue to the desired size of £' . Note that, in order to get all objects up to ^4-£^™f max one always needs to go up to I — 2^ max 
during recursion. This is because of the term which demands an object indexed (£ + 1) in the previous £' row. 
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Figure 29. The result of a joint analysis of 146 patches on the same CMB map. The solid line shows the average full sky power spectrum, 
the histogram shows the binned full sky power spectrum and the shaded boxes show the expected 2<r deviations due to noise, cosmic 
and sample variance. The sizes of the shaded boxes were calculated from the formula for uniform noise (equation The dots show the 
estimates with 2a error bars taken from the Fisher matrix. As before the rising solid line is the noise power spectrum and the vertical 
line shows where S/N = 1. 



To start the recurrence, one can precomputed the A™™ factors fast and easily using FFT and a sum over rings on the grid. 
F.ex. for the HEALPix grid, we did it the following way, 

JV r -l 

<m'm _ \ , >f \r \ * -2-nij/N r (m-m') r , /pirt 

A m'l — / J A m'm' A lm / j e UYj ! l^ 1D J 

r 3=0 

where the last part is the Fourier transform of the Gabor window, calculated by FFT, r is ring number on the grid and j is 
azimuthal position on each ring. Ring r has N r pixels. 

It turns out that the recurrence can be numerically unstable dependent on the window and multipole, and in order to 
avoid problems we (using double precision numbers) restart the recurrence with a new set of precomputed A™ t m for every 
50th H! row. However for most windows and multipoles that we tested the recurrence can run for hundreds of ^-rows without 
problems. 



APPENDIX D: ROTATIONAL IN VARIANCE 

It was shown that the average (Ce) is invariant under rotations of the Gabor window. We will now show that the non-averaged 
Ce are rotationally invariant under any rotation of the sky AND Gabor window by the same angle. This fact justifies that we 
can always put the window on the north pole since this simplifies the calculations. In the following we will use the rotation 
matrices D l mm , described in Appendix (|a|). Consider a rotation of the sky and window by the angles (—7 — f3 — a). Then the 
&£m becomes, 

aZ = J dh[b{-~t -(3- Q)T(n)G(n)]F/ m (n). (Dl) 
If one makes the inverse rotation of the integration angle n, one can write this as 
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Figure 30. Same as Fig. ^ but with a different noise model. Here the average of 100 estimations is shown. The big dots in the middle of 
the bins show the result using analytical expressions for the noise matrices. The crosses on the left side of each big dot show the results 
of using noise matrices from -/V a ; m = 1000 MC simulations. The crosses on the right side are for N s i m = 20000 noise simulations. 




Figure 31. A slice of the noise correlation matrix at multipole i = 500. The correlation matrix was evaluated using the analytical 
formulae (solid line), 20000 MC simulations (dashed line) and 1000 MC simulations (dotted line). The matrix is here normalised to be 
1 at the diagonal. 
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&Z = J dnT(n)G(n)[D*(a/3 7 )Y/ m (n)], (D2) 
which is just 



~rot 



^Dt^iafiy) J dnT(n)G(n)Y? m ,(n). (D3) 



One can identify the last integral as the normal ag. m . 

a\°l = £>m'm( a /37)2<?m. (D4) 

m' 

So the at m are NOT rotationally invariant. Rotation mixes m-modes for a given lvalue. 
Now to the Ci. One has that 

2£+i a e°n a im (D5) 



rot. 



= 2£ 1 + i ^ »<?m'al m " ^ Oj/ m (Q/37)-Pm"m(Q^7)- (D7) 

m'rn" m 

Using the properties given in Appendix ([a]), one can write the last D-function on the last line as, 

t%„ m (aPy) = D m „ m (-af3~y) (D8) 

= e- ira " Q dl„ m (/3)e- i ^ (D9) 

= e- im " a <t m »(-P)e- im ^ (D10) 

= Di m „(- 7 -(3-a). (Dll) 

Knowing that (—7 — f3 — a) is the inverse rotation of (a/3 7 ) one can write, 

$^i£/ m (a/^)£&, m (a07) = ^^J«WoL»(-7^-«) (D12) 

m m 

= D^ m „(000) = 5 ro / m » (D13) 

So one gets, 

CT* = (D14) 



APPENDIX E: THE CORRELATION MATRIX 



To do fast likelihood analysis with Ct one needs to be able to calculate {Ct) and the correlations {CeCf) fast. Calculating the 
average (Ct) by formula @ using the analytic expression ( p"3| ) for the kernel is not very fast. It turns out that a faster way 
of evaluating the kernel is by using direct integration (summation on the pixelised sphere) and then, as shown in Appendix 
(^), recurrence. By means of an integral, one can then write the at m as (now assuming that no is on the north pole), 



^ at'm' \ G(h)Y e * m (n)Y e , m ,(n)dn 

I'm' ^ 

J2 a e'm' J G{9)\ lm {9)X l , m ,{9)dcos9 J 



Ct >S I) I (! '^' m m 'C 



= ^2a llm 2-K J G{9)\ lm {6)\ t , m {9)d cos 9 

= J2 a i'mh^,£',m), (El) 
e' 

where the last line defines h(£,£',m) and \tm(9) is given by, 

Y lm (0,4>) = A* m (0)e~*™- (E2) 
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Using this form, one gets, 

{dl) = 27Tl5I C/ 'H ft2(£ ' £ '' m) - (E3) 

I' m 

To obtain this expression, fio was on the north pole, but as was shown, the {Ct)s are rotationally invariant, that is (Ce) 
remains the same if one rotates the Gabor window so that it is centred on the north pole. 

When using real CMB data, the observed temperature map is always pixelised. So an integral over the sphere has to be 
replaced by a sum over pixels. In this case, the formula for h(£, £' , m) has to be replaced by 

h(£,l',m) = J2 G ^LK'm*j, ( E4 ) 

3 

where the index j is the pixel number replacing the angle 9 and Aj is the area of pixel j. Using a pixelisation scheme like 
HEALPix (Gorski, Hivon and Wandelt 1998) which has a structure of azimuthal rings going from the north to the south pole 
with N r pixels in ring r and equal area for each pixel Aj = A this can be written as 

N r -1 

h(£,£',m) = A^ 2J GrKmK'm, 
r p— 

= A^N r GrXZ m \i m . (E5) 

r 

Here the sum over pixels is split into a sum over rings r and a sum over the pixels in each ring p. The first sum goes over all 
rings which have 9 < 9c- 

Using this expression for the ae m one can now find the correlation matrix 

) - 2^ (2£+l)(2^ + l) • (E6) 



In this expression one can use relation ( JEip to get, 

(C e C e r) = | jT7^ | ^ ^ {a*L m a L > m a* Km ,a Klm :) (E7) 

mm 1 LL' KK' 

xh(£, L, m)h(£, L', m)h(£', K, m')h(£', K',m) (E8) 
(2^ + 1) (2^ + 1) ^ 

X ^ ^2 [{ a *LmO.L' m ){aK m 'O.K'm') (E10) 
mm 1 LL' KK' 

+ {aLm,aKm')( a L'mClK>m') (EH) 
+ { a Lm a K'm'} { a L'm a Km' 

}] (E12) 

x h(£,L,m)h(£,L',m)h(£',K,m')h(£',K',m'). (E13) 
Clearly the first term is just the product (Ce)(C^/), and the two last terms are equal (using a* Km , — a Km i{— l) m and 

CLK'm' = ("I)'" a*K'm') SO OIle S etS ' 

2 Z) ^Cf£'»(^.Am)fc(ii,L,m)J . (E14) 

This is one of the main results of this paper since the formula allows one to analytically calculate the correlation matrix 
needed for likelihood analysis. Another main result is the recurrence deduced in appendix (Q) which allows fast evaluation of 
the h(£,£',m) functions and thereby this correlation matrix. 

By using the binning of the power spectrum described in equation (pi|), the correlation matrix can be calculated faster 
if it is written as 

Mfi = ^^D b D b , X (b,b',i,j), (E15) 

b b' 

where x(b,b' ,i, j) is given as, 
X(b,b',i,j) 



Ml3 ~ {2£ + l)(2£ 



(2ii + l)(2£j+l) 
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x ^2 ( ^2Bie{£ + l)h(e,£i,m)h(e,£j,m) ] 

m \ leb / 

x [ ^Bll(l+l)h(t,ti,m)h(l,tj,m) ] , (E16) 

which is precomputed. The sums over I here go over the £ values in each specific bin b. One sees that computing the likelihood 
takes of the order (7V bm ) 2 (A?" ln ) 2 operations whereas the precomputation of the factor x(b,b' ,i, j) goes as ^ ma x(-^V ln ) 2 A r m where 
N m is the number of m values used. Note that the multipole coefficients of the beam Be are also included. The reason is that the 
input data is always affected by the beam and this is corrected for by using the beam convolved full sky power spectrum CiBf. 

The sum over m in the expressions for the covariance matrix and (Ct) can be limited. The /i-functions are rapidly de- 
creasing for increasing m for Gaussian and top-hat windows. For Gaussian Gabor windows it seems that one can cut the sums 
over m at N m = 200 to high accuracy. For top-hat windows, the sum should be extended to N m = 400. 



APPENDIX F: INCLUDING WHITE NOISE 

In this appendix the total {Ct) and the correlation matrix including contributions from white noise is found analytically. We 
assume that each pixel j has a noise temperature denoted by rij , with the following properties, 

(rij) = 0, (njrif) = Sjj/tTj, (Fl) 

where Oj is the noise variance in pixel j. Then one has the following expressions for the a< m and C; (we use superscript N for 
noise quantities), 

o? m = ^njY&Aj (F2) 

5 

("Ww') = ^(w^Y^^A,, - E^„^„/Aj (F3) 

55' 5 

m j 

Here Y/ m is the Spherical Harmonic of the pixel centre of pixel j. For the windowed coefficients, one gets similarly, 

5 

® = iE^ A ? (F6) 

5 

The next step is to find the noise correlation matrix, 

= (a + i ; (2f + 1) EEWA, (F7) 

mm' jj'kk' 

x (n^n.n^G^G.GyY^Y^Y^Ypi, (F8) 
= (C?)(C$) + M£,, (F9) 
where Mf t , can be written as, 

Mf t , 



mm' \ J / 

For pixelisation schemes like HEALPix, the expression can be evaluated fast using FFT. This is apparent when one writes 
the sum over pixels as a double sum over rings and pixels per ring. 

JV r -l 

]T A)G]a)Yi m Yi, m ; = AJ m A^ m , £ e^""™ '>A 2 G?a* p . (Fll) 

j r p— 

In the case of an axisymmetric noise model, this expression becomes even easier which is apparent writing this as 
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J2^mA r e , m ,A 2 G*a 2 r e-^ (m - m,) = A; m A;, m AG 2 a 2 = h'(£,£',m). (F12) 



p=0 



G' 



The sum is equivalent to the previous expression for h(£,£',m) (equation E5) with a new window G' r . This motivates the 
definition of h(£,£' ,m,m') such that 

M~fe = ~, T7 r ^ h' 2 (£,£',m.m'), (F13) 

mm' 

where 

ti {£,£', m,m') = A^G'^Y^,, (F14) 

3 

where G'j = AG 2 a 2 p . These function can also be calculated using the recursion which we deduce in appendix (|c|). Note that 
the noise correlation matrix usually is diagonally dominant and calculating only the elements close to the diagonal suffices 
and speeds up the calculations. 

One can then find the total correlation matrix, splitting it up into one part due to signal, one part due to noise and a 
cross term, 

aim = o| m + a^ m (F15) 
C e = -^ l ^{ai m a,r m ) = cf + Cf + Cf (F16) 

m 

{C e ) = <<?!> + <<??) (F17) 
t (ae-mCHm + aJLafm) > (F18) 

m 

where the assumption that there is no correlation between signal and noise was used. One can then see that the correlation 
matrix can be written in a similar manner, 

{C h G tj ) - {C ti ){C tj ) = Mfj + M™ + {Clef.). (F19) 

This is another major result of this paper showing the full correlation matrix of Ci including noise. One can write the cross 
term as, 

(C t C v ) - 4^ {2e+1){2ei + 1) ( F2 °) 

mm' 

a \ ( ai £m a e'm)( a em a e'm) CFZ]) 
(2£+l)(2£' + l) ' 1 1 

m 

where the relation (5f m 2f* m /) = 8 mm i (af m af * m ) was used. From the above, one can see that these two factors can be written 
as, 

(aL4m) = ^Cf»/i(4/',m)M£',r,m), (F22) 
i" 

{a,£ m a t i m ) = S ' Gj Y lm Y t i m A i a[ , (F23) 

i 

= h'(£,£',m). (F24) 
Again using the binning in equation (pi|), one can write the signal- noise cross correlation matrix similar to equation (El 



Ml = (ClCf.) = D bX '(b,i,j), (F25) 

k 

where 
x'(M>j) 



(2£i + l)(2£j + 



{^2 B U( i + 1 )H^ e i' m ) h (^ i j' m )j h'(£i,£j,m). (F26) 

m \ leb / 
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APPENDIX G: DERIVATIVES OF THE LIKELIHOOD 

In the minimisation of the likelihood, one also needs the first and second derivative of the log-likelihood with respect to the 
bin values Di, described in equation (^4|). These can be found to be, 

FiT r)r\ T 

— = T,(A b )+f T h 6 + 2 — f (G1) 

gJgL- = -T,(A 6 A,) + T,(C-^-) + 2h-C-h, (G2) 

-2h?g, - 2h£ g6 - f T ^-^f + 2f T k b , + 2—g, (G3) 
We have used the following definitions, 

h 6 = gc-d (G5) 

» = «*> 

f = C x d (G7) 

k - = inSk (G8) 

Here the derivatives of d are, 

ddi 9{Cti) -1 X^fx^urc r' \2dCy 2 \ fnc ^ 

^ = -w = 2^zJ^ ( ' ,m) w^J' (G9) 

m v L> ' 

90^7 = -^^7 = ^TtLIL^^--) W^ Bt y (G10) 

m x L' 7 

Obviously for our binning, the double derivative of d disappears. 



APPENDIX H: CORRELATION BETWEEN DIFFERENT PATCHES 



Suppose one has two axisymmetric Gabor windows, G A (n) and G B (n), centred at two different positions A and B on the 
sky. Suppose also that the rotation operators D A and D B will rotate these patches so that the centres are on the north pole. 
Considering patch A, one can define, 

G A (n) [D A T(n)] Y tm (n), (HI) 
where G A is the window G A rotated to the north pole. Since T(n) = J^, ae m Yi m (n), one gets that 

D A T(n) = J2 ae ™J2 D l * m Y em ,(n). (H2) 

£m m' 

Here the -D^/ m coefiicients are described in appendix ([a]). One now gets, 

h A {tj',m)5 mm „ (H3) 



£ f m f 

e'm 



/ / at' m 'D t mm>h A {l,t' m), (H4) 



where h A (l,l' , m) is just the h(£,£',m) function for the Gabor window G A (n). 

The next step is to find the correlations between C A and Cf, defined for patch A as, 

Gt = V 5 *" 5 *" . (H5) 
* 2^+1 v ; 
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Following the procedure we used for a single patch one gets, 

lf,Af,B\ _ 1 V~^/~A*~A ~B* -B i 

\yi <^i>) — (2^ + l)(2l' + 1) / . \ a tm a lm a i'm' a t'm'l 

mm' 

= ( c *H> c e)+ {2 e + i) ( 2 *' + i) • (H6) 

One can use the expression for af m to find, 

(o-tmO'tm) ~ ^ ^ ' { a l"m" a L" M")D mm ii D m i M rr (H7) 
£"m" L"M" 

x h A (£,£", m)h B (t,L",m) (H8) 

= E^''(E^''^«") AA ( / '^ m ) hB ^^'' m ') (H9) 

0^' m ,(A) 

= ^C £ »d™' m ,(A)/i A (^/,m)/i B (^,^m'), (H10) 

f" 

where A is the angel between the centres of the patches. Relations from appendix (|a|) were used here. 

The next step is to see what happens when noise is introduced. We assume that the noise is uncorrelated. The noise in 
pixel j is rij and (rijrij>} = SjjXTj. From above one has, 

Cf = C AS + Cf N + Q AX , (Hll) 
where 

-NA-NA* 

Cf N = (H12) 



^AX 



-AN 
a (m 



21+1 

-NA-SA* 



E a lm a lm (HIS^ 
21 + 1 1 ' 

m 

(H14) 



where the last sum is over pixels, G A and n A being the window and noise for pixel j respectively. 
The correlation between the two patches then becomes, 

(CfCf,) - <Cf > (Of,) = M% + AfjJ, + <Cf X C? X ) , (H15) 

where, 



M "' = (2£+l){2£> + 1) ^ K a ^*agL)| 2 (H16) 



»,N 2 I /-AN* -BN \i2 fTJT7\ 

"' = (2l+l)(2l' + l) 2j <aft " ae ' m,) \ ■ (H17) 

mm' 

Finally, 

//VAX^BXi 4 V^Z-ASt-BS w-ANt-BN \ mi o\ 

mm' 

Now one needs an expression for (a^tZ/f/ 1 ^* }• One gets, 

} = G A Gy (n A nf, ) F/ ro Y/, m , . (H19) 
n' 

Here there are only correlations between overlapping pixels. If there are no overlapping pixels between the patches, this term 
is zero. Otherwise this can be written as a sum over the overlapping pixels 

{af^aTm') = J2 G t G ?°i Y LYL> ■ (H20) 



